Detecting Non-compositional MWE Components using Wiktionary
نویسندگان
چکیده
We propose a simple unsupervised approach to detecting non-compositional components in multiword expressions based on Wiktionary. The approach makes use of the definitions, synonyms and translations in Wiktionary, and is applicable to any type of MWE in any language, assuming the MWE is contained in Wiktionary. Our experiments show that the proposed approach achieves higher F-score than state-of-the-art methods.
منابع مشابه
Salehi, Bahar, Paul Cook and Timothy Baldwin (to appear) Detecting Non-compositional MWE Components using Wiktionary, In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar
We propose a simple unsupervised approach to detecting non-compositional components in multiword expressions based on Wiktionary. The approach makes use of the definitions, synonyms and translations in Wiktionary, and is applicable to any type of MWE in any language, assuming the MWE is contained in Wiktionary. Our experiments show that the proposed approach achieves higher F-score than state-o...
متن کاملPredicting the Compositionality of Multiword Expressions Using Translations in Multiple Languages
In this paper, we propose a simple, languageindependent and highly effective method for predicting the degree of compositionality of multiword expressions (MWEs). We compare the translations of an MWE with the translations of its components, using a range of different languages and string similarity measures. We demonstrate the effectiveness of the method on two types of English MWEs: noun comp...
متن کاملAutomatic Identification Of Non-Compositional Multi-Word Expressions Using Latent Semantic Analysis
Making use of latent semantic analysis, we explore the hypothesis that local linguistic context can serve to identify multi-word expressions that have noncompositional meanings. We propose that vector-similarity between distribution vectors associated with an MWE as a whole and those associated with its constitutent parts can serve as a good measure of the degree to which the MWE is composition...
متن کاملConstruction of English MWE Dictionary and its Application to POS Tagging
This paper reports our ongoing project for constructing an English multiword expression (MWE) dictionary and NLP tools based on the developed dictionary. We extracted functional MWEs from the English part of Wiktionary, annotated the Penn Treebank (PTB) with MWE information, and conducted POS tagging experiments. We report how the MWE annotation is done on PTB and the results of POS and MWE tag...
متن کاملUFRGS&LIF at SemEval-2016 Task 10: Rule-Based MWE Identification and Predominant-Supersense Tagging
This paper presents our approach towards the SemEval-2016 Task 10 – Detecting Minimal Semantic Units and their Meanings. Systems are expected to provide a representation of lexical semantics by (1) segmenting tokens into words and multiword units and (2) providing a supersense tag for segments that function as nouns or verbs. Our pipeline rule-based system uses no external resources and was imp...
متن کامل